Abstract

The contraction properties of the Extended Kalman Filter, viewed as a deterministic
observer for nonlinear systems, are analyzed. This yields new conditions under which
exponential convergence of the state error can be guaranteed. As contraction analysis
studies the evolution of an infinitesimal discrepancy between neighboring trajectories,
and thus stems from a differential framework, the sufficient convergence conditions are
different from the ones that previously appeared in the literature, which were derived
in a Lyapunov framework. This article sheds another light on the theoretical properties
of this popular observer.

1 Introduction

Since the seminal work of Kalman and Bucy [S] and Luenberger [TJ], the problem of building
observers for deterministic linear systems has been laid on firm theoretical ground. Yet, when
the system is nonlinear, there are no general methods to tackle observer design. Over the last
decades, nonlinear observer design has been an active field of research, and several methods 
have emerged for attacking some specific nonlinearities. In the engineering world, the most 
popular method is the so-called Extended Kalman Filter (EKF), a natural extension of the 
Kalman filter. The principle is to linearize the system around the trusted (i.e. estimated) 
trajectory of the system, build a Kalman filter for this time-varying linear model, and implement
it on the nonlinear model. The EKF is known to yield good results in practice when
the guess on the initial state is close enough to the actual state, but possesses no guarantee 
of convergence in the general case, and indeed can diverge. 

Since the 1990's, several papers have addressed the convergence properties of the EKF 
viewed as a deterministic observer. Several conditions under which the estimation error 
converges to zero have been derived in, e.g., [TJ [THl [H [13]. In each case, a first set of 
conditions on the observability and controllability of the system ensures the boundedness of 
the solution of the Riccati equation and of its inverse, and a second set of conditions ensures
in this case the convergence of the estimation error to zero. Roughly speaking, the latter
conditions require either the initial estimation error to be small, proving the EKF is a local 
observer, or the system to be very weakly nonlinear. 

In this paper, the convergence properties of the EKF are studied using contraction theory
[12] and in particular the notion of virtual systems [17] and virtual observers [8] . Historically, 
ideas closely related to contraction can be traced back to [HI [7] H] (see e.g. [13] for a more 
exhaustive list of references). In the present case the idea is as follows: instead of studying 
directly the evolution of the discrepancy, in the sense of a Lyapunov function, between the
estimated state and the true state, contraction theory allows one to study the evolution of the
discrepancy between two nearby trajectories of the EKF, in the sense of a given metric. It
is shown that, in a finite region and under some conditions, two nearby trajectories tend 
exponentially towards each other. As a result, the EKF is a dynamical system which exponentially
forgets its initial condition, a very desirable property for a filter. The fact that the
estimated state tends exponentially to the true state then appears as a mere consequence
of the contraction properties of the filter. Even though the Lyapunov approach and the
contraction approach are based on very similar metrics, the convergence results obtained in 
this paper differ from those of the literature. 

The main contributions of this paper are threefold. First, the paper studies the stability 
properties of the EKF from the perspective of contraction theory. This offers an alternative 
viewpoint to the usual Lyapunov approach, extending the preliminary results on linear time- 
varying systems of [5] . In turn, this perspective allows simple new convergence results to be 
derived in this context (see in particular Theorem 1 and its corollary). Finally, some of the 
results are closely related to existing recent literature, showing both their similarities and 
their potential strengths. 

The paper is organized as follows. In Section 2 some new general contraction results are
derived. Section 3 builds upon those results to derive bounds on the size of the contraction
region and the convergence rate of the EKF. Finally, Section 4 discusses some links with
previous work on the stability of the EKF.



2 General contraction results 

Consider the following nonlinear deterministic system

$$\frac{d}{dt} x = f(x,t) \qquad (1)$$

$$y_m = h(x,t) \qquad (2)$$

where $x \in \mathbb{R}^n$ is the state, $y_m \in \mathbb{R}^p$ is the measured output, and $f$, $h$ are smooth. The EKF
equations are given by

$$\frac{d}{dt}\hat x = f(\hat x, t) - P\,C(\hat x, t)^T R^{-1}\bigl(h(\hat x, t) - y_m(t)\bigr) \qquad (3)$$

$$\frac{d}{dt}P = A(\hat x, t)P(t) + P(t)A(\hat x, t)^T + Q - P(t)\,C(\hat x, t)^T R^{-1} C(\hat x, t)\,P(t) \qquad (4)$$

where $A(\hat x, t) = \frac{\partial f}{\partial x}(\hat x, t)$ and $C(\hat x, t) = \frac{\partial h}{\partial x}(\hat x, t)$. In the stochastic theory of linear Kalman
filtering, $Q$ and $R$ represent the covariances of the drift noise and of the measurement
noise, respectively. In a deterministic and nonlinear setting such as the one considered in the present
paper, they can be viewed more prosaically as design parameters, where $Q^{-1}$ represents the confidence in
the trusted model (1) and $R^{-1}$ the confidence in the measurements (2). The present analysis
relies on the following assumption.

Assumption 1 From now on we will systematically assume there exist $\underline{p}, \overline{p} > 0$ such that
$\underline{p} I \le P(t) \le \overline{p} I$. Moreover, for simplicity we assume that $Q$ is fixed and invertible, and we
denote by $q$ its smallest eigenvalue.
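
The observer equations (3)-(4) can be illustrated numerically. The following Python sketch integrates a scalar EKF with a forward-Euler scheme. The example system $f(x,t) = \sin x$, $h(x,t) = x$, the weights $Q = R = 1$, the step size and the initial conditions are all assumptions chosen for illustration; none of them comes from the paper. For this particular choice the scalar Riccati equation keeps $P(t)$ between $\sqrt{2} - 1$ and $\sqrt{2} + 1$, so Assumption 1 holds.

```python
import math

# Hypothetical scalar example (not from the paper): dx/dt = sin(x), y_m = x.
def f(x, t):
    return math.sin(x)

def h(x, t):
    return x

def A(x, t):      # df/dx
    return math.cos(x)

def C(x, t):      # dh/dx
    return 1.0

def simulate_ekf(x0, xhat0, P0, Q=1.0, R=1.0, dt=1e-3, T=10.0):
    """Forward-Euler integration of the true system (1)-(2) and the EKF (3)-(4)."""
    x, xhat, P = x0, xhat0, P0
    history = [(0.0, x, xhat, P)]
    for k in range(int(round(T / dt))):
        t = k * dt
        y_m = h(x, t)                                  # measured output (2)
        a, c = A(xhat, t), C(xhat, t)                  # linearization at the estimate
        K = P * c / R                                  # scalar gain P C^T R^{-1}
        dx = f(x, t)                                   # true dynamics (1)
        dxhat = f(xhat, t) - K * (h(xhat, t) - y_m)    # observer (3)
        dP = 2.0 * a * P + Q - P * c / R * c * P       # Riccati equation (4)
        x, xhat, P = x + dt * dx, xhat + dt * dxhat, P + dt * dP
        history.append((t + dt, x, xhat, P))
    return history

history = simulate_ekf(x0=0.5, xhat0=1.5, P0=1.0)
t_end, x_end, xhat_end, P_end = history[-1]
print(f"final error = {abs(xhat_end - x_end):.2e}, final P = {P_end:.3f}")
```

The stored values of $P$ give a direct way to check Assumption 1 along the estimated trajectory, without using the true state.
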

The latter assumption on $P(t)$ appears in most papers dealing with the stability of the
EKF, e.g. [U [16]. It is well known that this assumption is verified as soon as the system
$\dot\xi = A(\hat x(t),t)\,\xi$, $\eta = C(\hat x(t),t)\,\xi$ is uniformly detectable. This is of course a very strong
prerequisite on the behavior of the filter. Yet, note that this assumption can advantageously
be checked by the user without any knowledge of the true trajectory. To the authors' best
knowledge, very few papers have addressed the stability of the EKF without referring to
Assumption 1: see [10] (and more generally high-gain observer techniques [5]) where local
convergence results are derived under some different, yet rather restrictive, assumptions.

Let $K(t) = P\,C(\hat x, t)^T R^{-1}$ denote the Kalman gain. Consider the "virtual" system [5]

$$\frac{d}{dt} z = f(z,t) - K(t)\bigl(h(z,t) - y_m(t)\bigr) \qquad (5)$$

The solution $x(t)$ of the true system (1) is a particular solution of the virtual system, since
for all $t \ge 0$ we have $h(x(t),t) - y_m(t) = 0$. The solution $\hat x(t)$ of the EKF equations (3)-(4) is
obviously another particular solution of the virtual system. As a result, if it can be proven
that the distance between two arbitrary trajectories of this system tends to zero, the convergence
of the estimation error $\hat x - x$ to zero will follow. In turn, this can be achieved by seeking
conditions under which the virtual system (5) is contracting.
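
The role of the virtual system can be checked numerically. The sketch below integrates the plant, the EKF and three copies of the virtual system (5) side by side, all driven by the same gain $K(t)$. The scalar example ($f = \sin$, $h(x) = x$, $Q = R = 1$), the initial conditions and the Euler step are assumptions made for illustration only. The copy started at $x(0)$ reproduces the true trajectory, the copy started at $\hat x(0)$ reproduces the EKF estimate, and a third copy started elsewhere approaches them.

```python
import math

# Hypothetical scalar example (not from the paper): dx/dt = sin(x), y_m = x, Q = R = 1.
def f(x):
    return math.sin(x)

def h(x):
    return x

def simulate_virtual(z0_list, x0=0.5, xhat0=1.5, P0=1.0, Q=1.0, R=1.0, dt=1e-3, T=10.0):
    """Integrate plant (1), EKF (3)-(4) and several copies of the virtual system (5)."""
    x, xhat, P = x0, xhat0, P0
    zs = list(z0_list)
    for _ in range(int(round(T / dt))):
        y_m = h(x)
        K = P * 1.0 / R                                # gain from the EKF estimate, dh/dx = 1
        dx = f(x)
        dxhat = f(xhat) - K * (h(xhat) - y_m)
        dP = 2.0 * math.cos(xhat) * P + Q - P * P / R
        dzs = [f(z) - K * (h(z) - y_m) for z in zs]    # same gain K(t) for every copy
        x, xhat, P = x + dt * dx, xhat + dt * dxhat, P + dt * dP
        zs = [z + dt * dz for z, dz in zip(zs, dzs)]
    return x, xhat, zs

x_T, xhat_T, (z_from_x, z_from_xhat, z_other) = simulate_virtual([0.5, 1.5, 2.5])
print(abs(z_from_x - x_T), abs(z_from_xhat - xhat_T), abs(z_other - xhat_T))
```
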

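Let us define a metric for the virtual system by choosing, similarly to the linear
time-varying case considered in [8], the squared length

$$\|\delta z\|^2_{P^{-1}} = \delta z^T P^{-1} \delta z \qquad (6)$$

where $P(t)$ is a solution of the Riccati equation (4) associated to the trajectory of the estimated
state $\hat x(t)$. We have

$$\frac{d}{dt}(\delta z^T P^{-1} \delta z) = \Bigl(\frac{d}{dt}\delta z\Bigr)^T P^{-1}\delta z + \delta z^T \Bigl(\frac{d}{dt}P^{-1}\Bigr)\delta z + \delta z^T P^{-1}\Bigl(\frac{d}{dt}\delta z\Bigr)$$

$$= \delta z^T\Bigl[\bigl(A(z,t) - K(t)C(z,t)\bigr)^T P^{-1} + \frac{d}{dt}P^{-1} + P^{-1}\bigl(A(z,t) - K(t)C(z,t)\bigr)\Bigr]\delta z$$

Using the fact that $\frac{d}{dt}P^{-1} = -P^{-1}\dot P P^{-1}$ where $\dot P$ is given by (4), and that

$$[C(z,t) - C(\hat x,t)]^T R^{-1} [C(z,t) - C(\hat x,t)] = C^T(z,t)R^{-1}C(z,t) - C^T(\hat x,t)R^{-1}C(z,t) - C^T(z,t)R^{-1}C(\hat x,t) + C^T(\hat x,t)R^{-1}C(\hat x,t) \qquad (7)$$

we finally have

$$\frac{d}{dt}(\delta z^T P^{-1}\delta z) = \delta z^T P^{-1} M P^{-1}\delta z \qquad (8)$$

where $M = P\tilde A^T + \tilde A P + P\tilde C^T R^{-1}\tilde C P - P C^T R^{-1} C P - Q$, where we let $\tilde A(z,t) = A(z,t) - A(\hat x,t)$
and $\tilde C(z,t) = C(z,t) - C(\hat x,t)$, and where $C$ denotes the matrix $C(z,t)$.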
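
The algebra leading to (8) can be sanity-checked in the scalar case, where all matrices reduce to numbers. The short sketch below evaluates the bracketed expression of the derivation directly, then through the matrix $M$, and compares the two. The numerical values are arbitrary placeholders, not data from the paper.

```python
def rate_direct(Az, Cz, Ah, Ch, P, Q, R):
    """Scalar value of (A_z - K C_z)^T P^{-1} + d/dt(P^{-1}) + P^{-1}(A_z - K C_z)."""
    K = P * Ch / R                                   # Kalman gain evaluated at the estimate
    Pdot = 2.0 * Ah * P + Q - P * Ch / R * Ch * P    # Riccati equation (4)
    return 2.0 * (Az - K * Cz) / P - Pdot / P**2     # uses d/dt(P^{-1}) = -Pdot / P^2

def rate_via_M(Az, Cz, Ah, Ch, P, Q, R):
    """Scalar value of P^{-1} M P^{-1}, with M as defined after (8)."""
    At, Ct = Az - Ah, Cz - Ch                        # A-tilde and C-tilde
    M = 2.0 * P * At + P * Ct / R * Ct * P - P * Cz / R * Cz * P - Q
    return M / P**2

# Arbitrary sample values: Az, Cz at the point z; Ah, Ch at the estimate.
sample = dict(Az=0.3, Cz=1.2, Ah=-0.7, Ch=0.9, P=1.5, Q=2.0, R=0.5)
lhs, rhs = rate_direct(**sample), rate_via_M(**sample)
print(lhs, rhs)
```
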

2.1 Main result 

Given two symmetric matrices $P_1$, $P_2$ we define a partial order letting $P_1 \le P_2$ if $P_2 - P_1$ is
positive semidefinite. We have the following preliminary result:

Lemma 1 Let $0 < \gamma < q/(2\overline{p})$. For each time $t \ge 0$ there exists $r(t) > 0$ such that for all $z$
satisfying $\|z - \hat x(t)\| \le r(t)$ we have

$$P\tilde A^T + \tilde A P + P\tilde C^T R^{-1}\tilde C P \le Q - 2\gamma P + P C^T R^{-1} C P \qquad (9)$$

Proof The inequality is obviously verified for $z = \hat x$: the left member of (9) then vanishes, whereas
the right member is a positive definite matrix since $Q - 2\gamma P \ge (q - 2\gamma\overline{p}) I > 0$. As $f$, $h$ are smooth,
the inequality holds in a neighborhood of $\hat x$.



At any time, the vectors $z$ lying within a distance at most $r(t)$ of $\hat x(t)$ are contained in
the contraction region, as for those vectors equality (8) becomes the contraction inequality

$$\frac{d}{dt}(\delta z^T P^{-1}\delta z) \le -2\gamma\,(\delta z^T P^{-1}\delta z)$$

This means the squared distance in the sense of metric (6) between two neighboring trajectories
in this ball will tend to reduce, the distance itself shrinking at rate $\gamma$. This leads to the following
general result:

Theorem 1 Assume that $\rho = \inf\{r(t),\ t \ge 0\} > 0$. Any trajectory of the system
(5) that starts in the ball of center $\hat x(0)$ and constant radius $\rho/\sqrt{\overline{p}}$ with respect to the metric
(6) remains in a ball of radius $\rho/\sqrt{\overline{p}}$ centered at the trajectory $\hat x(t)$ of the Extended Kalman
Filter (3)-(4), and converges exponentially to this trajectory in the sense of the metric (6)
with a time constant $1/\gamma$ for the exponential decay.

Mathematically, the theorem's result can be expressed as follows. Let $d_{P^{-1}}$ be the geodesic
distance associated with the metric (6). Let $z_1(t)$, $z_2(t)$ be the flows associated to the system
(5) with initial conditions satisfying $d_{P^{-1}}(z_i(0), \hat x(0)) < \rho/\sqrt{\overline{p}}$ for $i = 1,2$. Then for all times
$t \ge 0$

$$d_{P^{-1}}(z_1(t), z_2(t)) \le e^{-\gamma t}\, d_{P^{-1}}(z_1(0), z_2(0)) \qquad (10)$$

Proof The theorem is a straightforward application of Theorem 2 of [12], which states
that any trajectory which starts in a ball of constant radius with respect to the metric,
centered at a given trajectory and contained at all times in the contraction region with respect to the
metric, remains in that ball and converges exponentially to this trajectory, which is a natural
result in the theory of contracting flows (see e.g. [11]). Indeed, $(z - \hat x)^T P^{-1}(z - \hat x) < \rho^2/\overline{p}$
implies $\|z - \hat x\| < \rho$, so $z$ is contained in the contraction region by Lemma 1.

Remark 1 Note that, if the system is linear, we have $\tilde A(z,t) = 0$ and $\tilde C(z,t) = 0$ for all $z$, $t$, and we
recover the fact that the deterministic Kalman filter for linear systems converges globally
exponentially under Assumption 1.

2.2 Particular case of a linear output map

The theorem implies an interesting result in the common case of linear output maps. In
many nonlinear systems of engineering interest, the system output consists of an incomplete
measurement of the state vector. For instance, the output can be temperature or concentrations
in chemical reactors, currents in induction machines, position or velocity in mechanical
systems. Formally, this means that the output map is linear, i.e. $h(x) = Cx$, implying
$\tilde C(z,t) = 0$ for all $z$, $t$. Let then $\lambda_{\max}(\cdot)$ denote the largest eigenvalue of a symmetric matrix,
and let $\gamma > 0$. The following result is an immediate consequence of Theorem 1.

Corollary 1 Assume that the output map is linear, and that
$\lambda_{\max}\bigl(\tilde A(z,t)P(t) + P(t)\tilde A(z,t)^T\bigr) < q - 2\gamma\overline{p}$ for all $z$, $t$. Then the Extended Kalman Filter (3)-(4)
is globally exponentially convergent with a time constant $1/\gamma$ for the exponential decay.

This result shows that contraction analysis can yield new types of conditions under which
exponential convergence of the EKF is guaranteed. Indeed, under the assumptions of Corollary 1
the EKF will converge globally without the standard requirement that the Hessian of
the coordinates of $f$ is uniformly bounded. However, in a more general context, this standard
requirement will still be needed as illustrated in the next section.

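
The condition of Corollary 1 translates into a simple computation. The sketch below returns the supremum of the rates $\gamma$ compatible with that condition, given $q$, the bound $\overline{p}$ of Assumption 1, and an upper bound on $\lambda_{\max}(\tilde A P + P\tilde A^T)$. The function names and the scalar example $f(x) = \varepsilon \sin x$, $h(x) = x$, $Q = R = 1$ are hypothetical and only meant as an illustration. In that example $|\tilde A| \le 2\varepsilon$, and the scalar Riccati equation gives $\overline{p} = \varepsilon + \sqrt{\varepsilon^2 + 1}$ provided $P(0)$ does not exceed this value.

```python
import math

def guaranteed_rate_linear_output(q, p_bar, lam_sup):
    """Supremum of the rates gamma allowed by Corollary 1, or None if no rate qualifies.

    q       : smallest eigenvalue of Q
    p_bar   : upper bound on P(t) from Assumption 1
    lam_sup : upper bound on lambda_max(A~ P + P A~^T) over all z and t
    """
    margin = q - lam_sup
    if margin <= 0.0:
        return None
    return margin / (2.0 * p_bar)

def example_rate(eps):
    """Hypothetical scalar case f(x) = eps*sin(x), h(x) = x, Q = R = 1."""
    p_bar = eps + math.sqrt(eps**2 + 1.0)      # bound on P for |df/dx| <= eps
    lam_sup = 2.0 * (2.0 * eps) * p_bar        # |2 A~ P| <= 2 * (2 eps) * p_bar
    return guaranteed_rate_linear_output(1.0, p_bar, lam_sup)

gamma_weak = example_rate(0.1)      # weak nonlinearity: the condition can be met
gamma_strong = example_rate(1.0)    # strong nonlinearity: the condition fails
print(gamma_weak, gamma_strong)
```

Any $\gamma$ strictly below the returned value satisfies the strict inequality of the corollary. When the function returns None, the corollary simply does not apply, which says nothing about the actual behavior of the filter.
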
3 A sufficient condition for exponential convergence



We now derive a lower bound on the size of the contraction region of Theorem 1. This
result relies on usual assumptions on the boundedness of the second derivatives of $f$ and $h$ around the
observer trajectory $\hat x(t)$, $t \ge 0$. We let $\|\cdot\|$ and $|||\cdot|||$ denote the norms on, respectively, matrices and
tensors induced by the Euclidean norm on vectors.



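Assumption 2 There are positive numbers $a$, $\kappa_A$, $\kappa_C$ such that for all $z$ satisfying $\|\hat x - z\| \le a$
and all $t \ge 0$ we have $|||\frac{\partial^2 f}{\partial x^2}(z,t)||| \le \kappa_A$ and $|||\frac{\partial^2 h}{\partial x^2}(z,t)||| \le \kappa_C$.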

Under Assumptions 1 and 2, one can derive the following local exponential stability result: 



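Corollary 2 Let $\gamma < q/(2\overline{p})$. Let $\zeta_+$ be the positive root of the equation

$$\frac{\overline{p}^2\kappa_C^2}{r}\,\zeta^2 + 2\overline{p}\,\kappa_A\,\zeta - (q - 2\gamma\overline{p}) = 0$$

where $r > 0$ is such that $R \ge rI$ (not to be confused with the radius $r(t)$ of Lemma 1).
Let $\rho = \min(a, \zeta_+)$. Any trajectory of the system (5) that starts in the ball of center $\hat x(0)$ and
constant radius $\rho/\sqrt{\overline{p}}$ with respect to the metric (6) remains in a ball of radius $\rho/\sqrt{\overline{p}}$ centered
at the trajectory $\hat x(t)$ of the Extended Kalman Filter (3)-(4), and converges exponentially in
the sense of the metric (6) with a time constant $1/\gamma$ for the exponential decay. In particular,
this implies in the Euclidean metric on vectors that

$$\|\hat x(t) - x(t)\| \le \sqrt{\overline{p}/\underline{p}}\ \|\hat x(0) - x(0)\|\, e^{-\gamma t} \qquad (11)$$

for all $t \ge 0$ as soon as initially $\|\hat x(0) - x(0)\| < \rho\sqrt{\underline{p}}/\sqrt{\overline{p}}$.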

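Proof The result directly follows from Theorem 1 as long as one can prove that $\rho \le r(t)$
for all $t \ge 0$, where $r(t)$ is the radius of a ball around $\hat x(t)$ in which the matrix inequality (9)
is verified. To begin, note that the ball of center $\hat x$ and radius $\zeta_+$ is equivalently defined as
the set

$$\Bigl\{z \in \mathbb{R}^n,\ 2\overline{p}\,\kappa_A\|\hat x - z\| + \frac{\overline{p}^2\kappa_C^2}{r}\|\hat x - z\|^2 < q - 2\gamma\overline{p}\Bigr\} \qquad (12)$$

Let $e = z - \hat x$. As $\tilde A(z,t) = A(z,t) - A(\hat x,t) = \int_0^1 \frac{\partial^2 f}{\partial x^2}(\hat x + \tau e, t)\,e\,d\tau$ and $\tilde C(z,t) = C(z,t) - C(\hat x,t) = \int_0^1 \frac{\partial^2 h}{\partial x^2}(\hat x + \tau e, t)\,e\,d\tau$,
we have $\|\tilde A\| \le \kappa_A\|\hat x - z\|$ and $\|\tilde C\| \le \kappa_C\|\hat x - z\|$. The largest
eigenvalue of the symmetric matrix $\tilde A P + P\tilde A^T$ satisfies

$$\lambda_{\max}(\tilde A P + P\tilde A^T) = \max_{\|w\|=1} w^T(\tilde A P + P\tilde A^T)w = 2\max_{\|w\|=1} w^T\tilde A P w$$

$$\le 2\overline{p}\max_{\|w\|=1,\ \|v\|=1} |w^T\tilde A v| \le 2\overline{p}\,\|\tilde A\| \le 2\overline{p}\,\kappa_A\|\hat x - z\|$$

In the same way, with $r > 0$ such that $R \ge rI$, we have $\lambda_{\max}(P\tilde C^T R^{-1}\tilde C P) \le \frac{\overline{p}^2}{r}\kappa_C^2\|\hat x - z\|^2$. Thus, as long as $z$ belongs
to the set (12), the left member of (9) is bounded above by $(q - 2\gamma\overline{p})I$, which is itself a lower bound
of the right member, and the inequality (9) is satisfied.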
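
The radius of Corollary 2 is easy to evaluate once the constants are known. The sketch below computes the positive root $\zeta_+$ of the quadratic equation of the corollary and the resulting Euclidean bound on the initial error. The function names are hypothetical, the numerical values are placeholders, and $r$ is interpreted here as a lower bound on the eigenvalues of $R$, which is an assumption of this sketch.

```python
import math

def zeta_plus(q, gamma, p_bar, kappa_A, kappa_C, r):
    """Positive root of (p_bar^2 kappa_C^2 / r) z^2 + 2 p_bar kappa_A z - (q - 2 gamma p_bar) = 0."""
    c0 = q - 2.0 * gamma * p_bar
    if c0 <= 0.0:
        raise ValueError("need gamma < q / (2 * p_bar)")
    a2 = p_bar**2 * kappa_C**2 / r
    a1 = 2.0 * p_bar * kappa_A
    if a2 == 0.0:                       # linear output map: the equation is linear in z
        return math.inf if a1 == 0.0 else c0 / a1
    return (-a1 + math.sqrt(a1**2 + 4.0 * a2 * c0)) / (2.0 * a2)

def euclidean_basin(q, gamma, p_low, p_bar, kappa_A, kappa_C, r, a=math.inf):
    """Bound rho * sqrt(p_low / p_bar) on the initial error, with rho = min(a, zeta_plus)."""
    rho = min(a, zeta_plus(q, gamma, p_bar, kappa_A, kappa_C, r))
    return rho * math.sqrt(p_low / p_bar)

# Placeholder constants: q = 1, p_low = 0.5, p_bar = 2, r = 2, gamma = q / (4 p_bar).
z_A = zeta_plus(1.0, 0.125, 2.0, kappa_A=0.5, kappa_C=0.0, r=2.0)   # limiting case kappa_C = 0
z_C = zeta_plus(1.0, 0.125, 2.0, kappa_A=0.0, kappa_C=1.0, r=2.0)   # limiting case kappa_A = 0
z_mix = zeta_plus(1.0, 0.125, 2.0, kappa_A=0.5, kappa_C=1.0, r=2.0)
print(z_A, z_C, z_mix, euclidean_basin(1.0, 0.125, 0.5, 2.0, 0.0, 1.0, 2.0))
```

In the two limiting cases the root reduces to $(q - 2\gamma\overline{p})/(2\overline{p}\kappa_A)$ and to $\sqrt{(q - 2\gamma\overline{p})\,r}/(\overline{p}\kappa_C)$ respectively.
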

4 Links with previous work in the literature



The Extended Kalman filter has been shown to converge locally exponentially under a set of
conditions on the nonlinearities of the system, see e.g. [15], [5] in continuous time and [16] [^l[T]
in discrete time. In these papers, the convergence analysis is based on the Lyapunov function

$$V(\hat x - x, t) = (\hat x - x)^T P^{-1}(t)\,(\hat x - x) \qquad (13)$$

At fixed time $t$, $V(\hat x - x, t)$ is the squared geodesic distance between $\hat x$ and $x$ for the proposed metric

$$\|\delta z\|^2_{P^{-1}} = \delta z^T P^{-1}\delta z$$

Thus, it is no surprise that the convergence conditions derived in this article are very similar 
to those previously appearing in the literature. That said, the specificity of the contraction 
framework yields some differences that we shall detail in this section, which is organized as 
follows. In Subsection 4.1 we emphasize a difference of point of view between contraction
analysis and Lyapunov-based convergence analysis. In Subsection 4.2 we compare the convergence
rate and basin of attraction of Section 3 with the results appearing previously in the
literature. For fair comparison we chose the article [15], which deals with the continuous-time
case. Finally, in Subsection 4.3 the modification of the EKF proposed in [15] is analyzed
in the light of contraction theory, yielding a simple generalization of this work which is 
straightforward in our framework. 

4.1 A different approach 

In standard Lyapunov analysis of nonlinear deterministic observers, one generally seeks to
prove that the state error, i.e., the discrepancy between the estimated trajectory $\hat x$ and the true
trajectory $x$, measured by some Lyapunov function, tends to zero. Often the observers can
only be proved to be locally convergent, i.e., the initial guess $\hat x(0)$ must belong to some attraction
basin containing $x(0)$. The contraction analysis of the present paper, based on the idea of
virtual systems [17] and specifically virtual observers [8], builds upon a different approach.
Indeed, the idea is to focus on a particular trajectory of the filter $\hat x(t)$, $t \ge 0$, and then to
study the evolution of two neighboring trajectories of the virtual system (5), and prove that
the distance between them tends to shrink over time. Under a set of conditions, we have
proved that this property holds in a ball centered at the particular trajectory $\hat x(t)$, $t \ge 0$.

The contraction results underlying Theorem 1 and its corollaries are natural properties to
expect from any filter. Indeed, they prove that as a dynamical system the observer possesses
stability properties, and this independently of the behavior of the true system, or of modeling
errors. To be more specific, it is proved that two initially close enough trajectories of the filter
equation will verify equation (10). This exponential forgetting of the initial condition is thus
a very desirable property for filters, which indicates robustness with respect to bad guesses 
and perturbations. This property is valid whether the true trajectory x(t) belongs to the 
contraction set or not. 

Besides, contraction theory provides a concrete result on the robustness of the EKF
against external perturbations. Suppose indeed that $x_p(t)$ is some trajectory of a perturbed
virtual system $\frac{d}{dt}x_p = f(x_p,t) - K(t)\bigl(h(x_p,t) - y_m(t)\bigr) + b(x_p,t)$, where $b(x_p,t)$ represents a
disturbance whose norm is supposed to be uniformly bounded by, say, $\|b\|_{\max}$. Then we have
(see [HIS]):

$$\|x_p(t) - \hat x(t)\| \le \sqrt{\overline{p}/\underline{p}}\,\bigl(e^{-\gamma t}\,\|x_p(0) - \hat x(0)\| + \|b\|_{\max}/\gamma\bigr)$$

It proves that any trajectory of the perturbed system converges exponentially to a ball of radius
$\sqrt{\overline{p}/\underline{p}}\,\|b\|_{\max}/\gamma$ around the observer trajectory, which allows one to evaluate the estimation
error generated by the perturbation. The latter result completes the robustness properties of the EKF.

              Convergence rate                                Attraction basin for $\kappa_C = 0$                                                          Attraction basin for $\kappa_A = 0$
Lyapunov      $\gamma = q\underline{p}/(4\overline{p}^2)$     $\|\hat x(0) - x(0)\| < (\underline{p}/\overline{p})\,[q/(4\kappa_A\overline{p})]$          $\|\hat x(0) - x(0)\| <$ qr/^cncV 2 )
Contraction   $\gamma = q/(4\overline{p})$                    $\|\hat x(0) - x(0)\| < (\underline{p}/\overline{p})^{1/2}\,[q/(4\kappa_A\overline{p})]$    $\|\hat x(0) - x(0)\| < (q\underline{p}r)^{1/2}/(\kappa_C\,\overline{p}^{3/2}\sqrt{2})$

Table 1: Comparison of convergence rate and attraction basin.



4.2 Comparison of convergence results 

First of all, to the authors' knowledge Theorem 1 and its Corollary 1 have never appeared
in the literature. The common approach is to study the evolution of the Lyapunov function
(13) over time. While similar results to Theorem 1 and its corollary may possibly be worked
out from this approach as well, they appear quite naturally in a contraction framework. 

Consider now the results of Section 3. As already mentioned, the standard Lyapunov
function (13) represents the squared geodesic distance between $\hat x$ and $x$ in the sense of the metric
proposed in this paper. This is why the bounds derived in Section 2 are very similar to
those previously obtained in the literature. However, they are different, and this is mainly
due to the use of equation (7) in the result (8). Again, the analog of this transformation in
the Lyapunov framework is not easily seen, whereas it appears naturally in the contraction 
framework. 

Assume for simplicity that $a = +\infty$. Table 1 compares the convergence rate and attraction
basin obtained in [15] and the ones obtained in the present paper in the two limiting
cases $\kappa_A = 0$ and $\kappa_C = 0$. The results all correspond to the error equation (11) in the Euclidean
metric. The rates and bounds in the Lyapunov approach of [15] have been obtained
letting $a = 0$ (this parameter has a different meaning in this article) and using the definition
of $\kappa$ derived therein.

We see that the results are quite similar, yet different. First of all, we immediately see
that both the convergence rate and the size of the guaranteed attraction basin are larger in
our approach in the limiting case $\kappa_C = 0$, improving the formerly obtained bounds of [15].
Indeed $\gamma$ given in Table 1 is greater in the contraction case by a factor $\overline{p}/\underline{p} \ge 1$ (yielding faster
convergence), and the size of the attraction basin by a factor $(\overline{p}/\underline{p})^{1/2} \ge 1$. The difference is
much more remarkable in the limit case $\kappa_A = 0$, in which case the size of the attraction basin
depends heavily on the problem parameters. In particular, a noticeable difference is that the
size of the attraction basin in [15] depends on $c$, an upper bound on $\{\|C(t)\|,\ t \ge 0\}$. As
a result, a large linearized observation matrix $C(t)$ can diminish the guaranteed size of the
attraction basin. On the other hand, the bounds obtained in the present paper do not rely
on an upper bound $c$, and therefore they may yield stronger guarantees in some cases.
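
The factors quoted above for the limiting case $\kappa_C = 0$ can be reproduced with a few lines of arithmetic. The sketch below evaluates the rate and basin formulas as read from Table 1 for that case, then forms the two ratios. The function names and the sample constants are placeholders chosen for illustration.

```python
import math

def lyapunov_bounds(q, p_low, p_bar, kappa_A):
    """Rate and Euclidean basin size attributed to the Lyapunov approach (kappa_C = 0)."""
    rate = q * p_low / (4.0 * p_bar**2)
    basin = (p_low / p_bar) * q / (4.0 * kappa_A * p_bar)
    return rate, basin

def contraction_bounds(q, p_low, p_bar, kappa_A):
    """Rate and Euclidean basin size from the contraction approach (kappa_C = 0)."""
    rate = q / (4.0 * p_bar)
    basin = math.sqrt(p_low / p_bar) * q / (4.0 * kappa_A * p_bar)
    return rate, basin

# Placeholder constants, not taken from the paper.
q, p_low, p_bar, kappa_A = 1.0, 0.5, 2.0, 0.5
rate_L, basin_L = lyapunov_bounds(q, p_low, p_bar, kappa_A)
rate_C, basin_C = contraction_bounds(q, p_low, p_bar, kappa_A)
rate_gain = rate_C / rate_L        # expected to equal p_bar / p_low
basin_gain = basin_C / basin_L     # expected to equal sqrt(p_bar / p_low)
print(rate_gain, basin_gain)
```

With these sample values the conditioning ratio $\overline{p}/\underline{p}$ equals 4, so the guaranteed rate improves by a factor 4 and the guaranteed basin by a factor 2.
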

4.3 A contraction-based interpretation of the observer of [15] 

In [15], the authors propose to modify slightly the EKF by adding a new term $2\beta P$ in the
Riccati equation with $\beta > 0$, leading to a modified Riccati equation:

$$\frac{d}{dt}P = A(\hat x, t)P(t) + P A(\hat x, t)^T + Q - P\,C(\hat x, t)^T R^{-1} C(\hat x, t)\,P + 2\beta P$$

They prove the addition of such a term yields a faster convergence rate, as long as Assumption
1 is preserved. Note that such a term tends to increase the eigenvalues of $P$ and thus to
destabilize the Riccati equation. The fact that Assumption 1 then remains valid is thus nontrivial
and must be checked. Then, as long as Assumption 1 is proved to hold, this term
ensures a faster convergence rate. This latter fact is easily understood in the contraction
framework. Indeed, with this additional term, inequality (9) becomes:

$$P\tilde A^T + \tilde A P + P\tilde C^T R^{-1}\tilde C P \le Q - 2(\gamma + \beta)P + P C^T R^{-1} C P$$



The benefits of this term are now obvious as it transforms the convergence rate $\gamma$ into $\gamma + \beta$.
In fact, contraction theory even offers a direct generalization of the work of [15]. Indeed
consider the following modified EKF observer



where N is any positive definite matrix. It follows directly from equation ^ that as long 
as Assumption 1 still holds, the guaranteed convergence rate of this observer is increased by 
adding n/p, where n denotes the smallest eigenvalue of N. 
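
The effect of the additional term $2\beta P$ can be illustrated in the simplest possible setting. The sketch below considers a hypothetical scalar linear system $\dot x = a x$, $y_m = x$, for which the modified Riccati equation has a closed-form steady state, and compares the resulting error decay rate with and without $\beta$. The system, the parameter values and the function names are assumptions made for illustration. The computed quantity is the asymptotic decay rate of the error in this special case, not the guaranteed rate discussed in the text.

```python
import math

def steady_state_P(a, Q, R, beta=0.0):
    """Positive root of 2(a + beta) P + Q - P^2 / R = 0 (scalar case, C = 1)."""
    s = a + beta
    return R * (s + math.sqrt(s * s + Q / R))

def error_decay_rate(a, Q, R, beta=0.0):
    """Rate of d/dt e = (a - P/R) e once P has reached its steady state."""
    return steady_state_P(a, Q, R, beta) / R - a

# Hypothetical unstable plant a = 0.5 with Q = R = 1.
rate_standard = error_decay_rate(0.5, 1.0, 1.0)             # standard EKF, beta = 0
rate_modified = error_decay_rate(0.5, 1.0, 1.0, beta=1.0)   # modified Riccati equation
P_mod = steady_state_P(0.5, 1.0, 1.0, beta=1.0)
riccati_residual = 2.0 * (0.5 + 1.0) * P_mod + 1.0 - P_mod**2 / 1.0
print(rate_standard, rate_modified, riccati_residual)
```

In this example the decay rate increases by more than $\beta$, and the steady state of $P$ grows with $\beta$, which is why the bounds of Assumption 1 have to be checked again for the modified filter.
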